

## Overview

- Motivation and introduction
- Functional model of a memory
- A simple minded test and its limitations
- Fault models
- March tests and their capabilities
- Neighborhood tests
- Summary





















### **Coupling Faults**

- Coupling Fault (CF): Transition in bit j causes unwanted change in bit i
- 2-Coupling Fault: Involves 2 cells, special case of k-Coupling Fault
  - Must restrict the locations of the k cells to make testing practical
- Inversion and Idempotent CFs -- special cases of 2-Coupling Faults
- Bridging and State Coupling Faults involve any # of cells, caused by a logic level rather than a transition
- Dynamic Coupling Fault (CFdyn) -- Read or write on j forces i to 0 or 1

### **March Test Notation**

- r0 -- Read a 0 from a memory location
- r1 -- Read a 1 from a memory location
- w0 -- Write a 0 to a memory location
- w1 -- Write a 1 to a memory location
- ↑ -- Write a 1 to a cell containing 0
- ↓ -- Write a 0 to a cell containing 1

### March Test Notation (Continued)

- ↕ -- Complement the cell contents
- ⇑ -- Increasing memory addressing
- ⇓ -- Decreasing memory addressing
- ⇕ -- Either increasing or decreasing
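As a minimal sketch (not taken from the slides), the notation above can be encoded as plain data: a march test is a list of march elements, and each element pairs an addressing order with the operations applied to every cell. The function name `expand` and the order labels `"up"`, `"down"`, and `"any"` are illustrative assumptions.

```python
def expand(march, n):
    """Return the (address, operation) sequence a march test applies to n cells.

    `march` is a list of (order, ops) march elements, where order is
    "up" (increasing), "down" (decreasing) or "any" (either; run as "up" here).
    """
    sequence = []
    for order, ops in march:
        addresses = range(n - 1, -1, -1) if order == "down" else range(n)
        for address in addresses:
            for op in ops:  # all operations of an element hit one cell in turn
                sequence.append((address, op))
    return sequence


# Hypothetical single-element test: decreasing order, read 1 then write 0.
element = [("down", ["r1", "w0"])]
print(expand(element, 2))  # [(1, 'r1'), (1, 'w0'), (0, 'r1'), (0, 'w0')]
```

The point of the encoding is that all operations of one element are applied to a cell before the march moves to the next address.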

11/8/2004

### MATS+ March Test

M0: { March element ⇕ (w0) }

for cell := 0 to n - 1 (or any other order) do

write 0 to A [cell];

M1: { March element ⇑ (r0, w1) }

for cell := 0 to n - 1 do

read A [cell]; { Expected value = 0}

write 1 to A [cell];

M2: { March element ⇓ (r1, w0) }

for cell := n - 1 down to 0 do

read A [cell]; { Expected value = 1 }

write 0 to A [cell];
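The pseudocode above translates almost line for line into Python. The following is a sketch under stated assumptions: the memory is modeled as a list-like object of single-bit cells, and `StuckAtMemory` is a hypothetical fault model added only to show a detection.

```python
def mats_plus(mem):
    """Run MATS+ on a list-like memory of 0/1 cells; return the failing reads."""
    n = len(mem)
    failures = []
    # M0: (w0), any address order
    for cell in range(n):
        mem[cell] = 0
    # M1: increasing addresses, (r0, w1)
    for cell in range(n):
        if mem[cell] != 0:
            failures.append(("M1", cell))
        mem[cell] = 1
    # M2: decreasing addresses, (r1, w0)
    for cell in reversed(range(n)):
        if mem[cell] != 1:
            failures.append(("M2", cell))
        mem[cell] = 0
    return failures


class StuckAtMemory(list):
    """Hypothetical faulty memory: one cell ignores writes and holds a fixed value."""

    def __init__(self, n, stuck_cell, stuck_value):
        super().__init__([0] * n)
        self.stuck_cell, self.stuck_value = stuck_cell, stuck_value
        super().__setitem__(stuck_cell, stuck_value)

    def __setitem__(self, address, value):
        if address == self.stuck_cell:
            value = self.stuck_value  # the write has no effect on the stuck cell
        super().__setitem__(address, value)


print(mats_plus([1, 0, 1, 1]))             # fault-free memory: no failures
print(mats_plus(StuckAtMemory(4, 2, 1)))   # stuck-at-1 cell fails the r0 in M1
print(mats_plus(StuckAtMemory(4, 2, 0)))   # stuck-at-0 cell fails the r1 in M2
```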

### Address Decoder Faults (ADFs)

- Address decoding error assumptions:
  - Decoder does not become sequential
  - Same behavior during both read & write
- Multiple ADFs must be tested for

| Fault   | Description                                  |
|---------|----------------------------------------------|
| Fault 1 | No cell accessed for address $A_x$           |
| Fault 2 | No address to access cell $C_x$              |
| Fault 3 | Multiple cells accessed with address $A_y$   |
| Fault 4 | Multiple addresses for cell $C_x$            |

### Theorem 9.2

- A March test satisfying conditions 1 & 2 detects all address decoder faults.
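- "..." means any number of read or write operations
- Before Condition 1, the test must have a wx element
  - x can be 0 or 1, but must be consistent throughout the test

| Condition | March element                                    |
|-----------|--------------------------------------------------|
| 1         | $\Uparrow (rx, \ldots, w\overline{x})$           |
| 2         | $\Downarrow (r\overline{x}, \ldots, wx)$         |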
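A small checker can make the two conditions concrete. This is a sketch, reading Condition 1 as an increasing-address element of the form (rx, ..., w x̄) and Condition 2 as a decreasing-address element of the form (r x̄, ..., wx). The test encoding and the function name are assumptions, elements marked "any" are conservatively not counted, and the preceding wx initialization is not checked.

```python
def satisfies_conditions(march):
    """True if, for some x in {0, 1}, the march test contains
    an "up" element (rx, ..., w not-x) and a "down" element (r not-x, ..., wx)."""

    def has(order, first, last):
        return any(
            o == order and len(ops) >= 2 and ops[0] == first and ops[-1] == last
            for o, ops in march
        )

    for x in (0, 1):
        nx = 1 - x
        if has("up", f"r{x}", f"w{nx}") and has("down", f"r{nx}", f"w{x}"):
            return True
    return False


MATS = [("any", ["w0"]), ("any", ["r0", "w1"]), ("any", ["r1"])]
MATS_PLUS = [("any", ["w0"]), ("up", ["r0", "w1"]), ("down", ["r1", "w0"])]

print(satisfies_conditions(MATS))       # False
print(satisfies_conditions(MATS_PLUS))  # True
```

MATS+ meets both conditions with x = 0, while MATS has no element that ends the read of 1 with a write of 0.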



### **Necessity Proof**

- Removing $rx$ from Condition 1 prevents A or B fault detection when $\overline{x}$ read
- Removing $r\overline{x}$ from Condition 2 prevents A or B fault detection when $x$ read
- Removing $rx$ or $w\overline{x}$ from Condition 1 misses fault D2
- Removing $r\overline{x}$ or $wx$ from Condition 2 misses fault D3
- Removing both writes misses faults C and D1


### **Sufficiency Proof**

- Faults A and B: Detected by SAF test
- Fault C: Initialize memory to $h$ ($x$ or $\overline{x}$). A subsequent March element that reads $h$ and writes $\overline{h}$ detects Fault C.
  - Marching ⇑ writes $\overline{h}$ to $A_v$. Detection: read $A_w$
  - Marching ⇓ writes $\overline{h}$ to $A_z$. Detection: read $A_y$
- Fault D: Memory returns a random result when multiple cells are read simultaneously. Generate the fault by writing $A_x$. Detection: read $A_w$ or $A_y$ (⇑ or ⇓ marches)

### Irredundant March Tests

| Algorithm | Description                                                                         |
|-----------|-------------------------------------------------------------------------------------|
| MATS      | { ⇕ (w0); ⇕ (r0, w1); ⇕ (r1) }                                                      |
| MATS+     | { ⇕ (w0); ⇑ (r0, w1); ⇓ (r1, w0) }                                                  |
| MATS++    | { ⇕ (w0); ⇑ (r0, w1); ⇓ (r1, w0, r0) }                                              |
| MARCH X   | { ⇕ (w0); ⇑ (r0, w1); ⇓ (r1, w0); ⇕ (r0) }                                          |
| MARCH C-  | { ⇕ (w0); ⇑ (r0, w1); ⇑ (r1, w0); ⇓ (r0, w1); ⇓ (r1, w0); ⇕ (r0) }                  |
| MARCH A   | { ⇕ (w0); ⇑ (r0, w1, w0, w1); ⇑ (r1, w0, w1); ⇓ (r1, w0, w1, w0); ⇓ (r0, w1, w0) }  |
| MARCH Y   | { ⇕ (w0); ⇑ (r0, w1, r1); ⇓ (r1, w0, r0); ⇕ (r0) }                                  |
| MARCH B   | { ⇕ (w0); ⇑ (r0, w1, r1, w0, r0, w1); ⇑ (r1, w0, w1); ⇓ (r1, w0, w1, w0); ⇓ (r0, w1, w0) } |
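One practical way to compare these tests is by length. Every march element applies all of its operations to each of the n cells, so the total operation count is n times the number of operations in the test. The sketch below uses the commonly published definitions of the eight tests; the dictionary layout and function name are assumptions.

```python
# Each test is a list of (address order, operations) march elements.
MARCH_TESTS = {
    "MATS":     [("any", "w0"), ("any", "r0 w1"), ("any", "r1")],
    "MATS+":    [("any", "w0"), ("up", "r0 w1"), ("down", "r1 w0")],
    "MATS++":   [("any", "w0"), ("up", "r0 w1"), ("down", "r1 w0 r0")],
    "MARCH X":  [("any", "w0"), ("up", "r0 w1"), ("down", "r1 w0"), ("any", "r0")],
    "MARCH C-": [("any", "w0"), ("up", "r0 w1"), ("up", "r1 w0"),
                 ("down", "r0 w1"), ("down", "r1 w0"), ("any", "r0")],
    "MARCH A":  [("any", "w0"), ("up", "r0 w1 w0 w1"), ("up", "r1 w0 w1"),
                 ("down", "r1 w0 w1 w0"), ("down", "r0 w1 w0")],
    "MARCH Y":  [("any", "w0"), ("up", "r0 w1 r1"), ("down", "r1 w0 r0"), ("any", "r0")],
    "MARCH B":  [("any", "w0"), ("up", "r0 w1 r1 w0 r0 w1"), ("up", "r1 w0 w1"),
                 ("down", "r1 w0 w1 w0"), ("down", "r0 w1 w0")],
}


def ops_per_cell(test):
    """Number of read/write operations the test applies to each cell."""
    return sum(len(ops.split()) for _, ops in test)


for name, test in MARCH_TESTS.items():
    print(f"{name}: {ops_per_cell(test)}n operations")
```

All eight tests are linear in n, which is what makes march tests practical for large memories.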

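### Irredundant March Test Summary

| Algorithm | SAF | AF   | TF  | CFin | CFid | CFdyn | SCF | Linked Faults |
|-----------|-----|------|-----|------|------|-------|-----|---------------|
| MATS      | All | Some |     |      |      |       |     |               |
| MATS+     | All | All  |     |      |      |       |     |               |
| MATS++    | All | All  | All |      |      |       |     |               |
| MARCH X   | All | All  | All | All  |      |       |     |               |
| MARCH C-  | All | All  | All | All  | All  | All   | All |               |
| MARCH A   | All | All  | All | All  |      |       |     | Some          |
| MARCH Y   | All | All  | All | All  |      |       |     | Some          |
| MARCH B   | All | All  | All | All  |      |       |     | Some          |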













### Passive NPSF

- Passive: A certain neighborhood pattern prevents the base cell from changing
- Condition for detection and location: Each base cell must be written and read in state 0 and in state 1, for all deleted neighborhood pattern changes.

### Static NPSF

- Static: Base cell forced into a particular state when deleted neighborhood contains particular pattern.
- Differs from active -- need not have a transition to sensitize an SNPSF
- Condition for detection and location: Apply all 0 and 1 combinations to k-cell neighborhood, and verify that each base cell was written.
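To get a feel for the size of this fault model, the static NPSFs of one base cell can be enumerated directly. The sketch assumes a k-cell neighborhood made of the base cell plus k - 1 deleted-neighborhood cells, and uses k = 5 (a Type-1 neighborhood: the base cell and its four nearest neighbors). The function name is illustrative.

```python
from itertools import product


def static_npsf_list(k):
    """Enumerate static NPSFs of one base cell in a k-cell neighborhood.

    Each fault is (deleted-neighborhood pattern, value forced on the base cell).
    """
    faults = []
    for pattern in product((0, 1), repeat=k - 1):  # all deleted-neighborhood patterns
        for forced in (0, 1):                      # base cell forced to 0 or to 1
            faults.append((pattern, forced))
    return faults


faults = static_npsf_list(5)
print(len(faults))   # 32, i.e. 2**5
print(faults[0])     # ((0, 0, 0, 0), 0)
```

The count matches the detection condition: all 2^k combinations of 0 and 1 over the k-cell neighborhood have to be applied.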




### Two Group Method

- Only for Type-1 neighborhoods
- Use a checkerboard pattern: each cell is simultaneously a base cell in group 1 and a deleted neighborhood cell in group 2
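The checkerboard property this method relies on is easy to verify. In the sketch below, cells are split into two groups by the parity of row + column, and a Type-1 neighborhood is taken to be the four nearest neighbors (north, south, west, east). All function names are illustrative assumptions.

```python
def group(row, col):
    """Checkerboard assignment of a cell to group 1 or group 2."""
    return 1 if (row + col) % 2 == 0 else 2


def type1_neighbors(row, col, rows, cols):
    """Deleted Type-1 neighborhood of a base cell, clipped at the array edges."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def neighbors_in_other_group(rows, cols):
    """True if every neighbor of every cell lies in the opposite group."""
    return all(
        group(r, c) != group(nr, nc)
        for r in range(rows)
        for c in range(cols)
        for nr, nc in type1_neighbors(r, c, rows, cols)
    )


print(neighbors_in_other_group(4, 4))  # True
```

Because every neighbor of a base cell falls in the other group, the base cells of one group serve as the deleted neighborhoods of the other.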


### RAM Tests for Layout-Related Faults

**Inductive Fault Analysis:** 

1. Generate defect sizes, locations, and layers based on fabrication line model
2. Place defects on layout model
3. Extract defective cell schematic & electrical parameters
4. Evaluate cell testing


### **Memory Testing Summary**

- Multiple fault models are essential
- Combination of tests is essential:
  - March -- SRAM and DRAM
  - NPSF -- DRAM
  - DC Parametric -- Both
  - AC Parametric -- Both
- Inductive Fault Analysis is now required


### **Summary**

- Functional and fault model of memory
  - Many fault models
- March tests and their capabilities
  - Variety of tests
- Neighborhood pattern sensitive tests
  - Variety of fault models and tests




### Density and Defect Trends

- 1970 -- DRAM Invention (Intel), 1024 bits
- 1993 -- 1st 256 MBit DRAM papers
- 1997 -- 1st 256 MBit DRAM samples
- 1 ¢/bit --> $120 \times 10^{-6}$ ¢/bit
- Kilburn -- Ferranti Atlas computer (Manchester U.) -- Invented Virtual Memory
- 1997 -- Cache DRAM -- SRAM cache + DRAM now on 1 chip

### Faults

- System -- Mixed electronic, electromechanical, chemical, and photonic system (MEMS technology)
- Failure -- Incorrect or interrupted system behavior
- Error -- Manifestation of fault in system
- Fault -- Physical difference between good & bad system behavior

### Failure Mechanisms

- Permanent faults:
  - Missing/Added Electrical Connection
  - Broken Component (IC mask defect or silicon-to-metal connection)
  - Burnt-out Chip Wire
  - Corroded connection between chip & package
  - Chip logic error (Pentium division bug)

### Fault Types

- Fault types:
  - Permanent -- System is broken and stays broken the same way indefinitely
  - Transient -- Fault temporarily affects the system behavior, and then the system reverts to the good machine -- time dependency, caused by environmental condition
  - Intermittent -- Sometimes causes a failure, sometimes does not

### Failure Mechanisms (Continued)

- Transient Faults:
  - Cosmic Ray
  - An α particle (ionized Helium atom)
  - Air pollution (causes wire short/open)
  - Humidity (temporary short)
  - Temperature (temporary logic error)
  - Pressure (temporary wire open/short)
  - Vibration (temporary wire open)
  - Power Supply Fluctuation (logic error)
  - Electromagnetic Interference (coupling)
  - Static Electrical Discharge (change state)
  - Ground Loop (misinterpreted logic value)

### Failure Mechanisms (Continued)

- Intermittent Faults:
  - Loose Connections
  - Aging Components (changed logic delays)
  - Hazards and Races in critical timing paths (bad design)
  - Resistor, Capacitor, Inductor variances (timing faults)
  - Physical Irregularities (narrow wire -- high resistance)
  - Electrical Noise (memory state changes)


### Physical Failure Mechanisms

- Corrosion
- Electromigration
- Bonding Deterioration -- Au package wires interdiffuse with Al chip pads
- Ionic Contamination --  $Na^+$  diffuses through package and into FET gate oxide
- Alloying -- Al migrates from metal layers into Si substrate
- Radiation and Cosmic Rays -- 8 MeV, collides with Si lattice, generates n-p pairs, causes soft memory error


### Fault Modeling

- Behavioral (black-box) Model -- State machine modeling all memory content combinations -- Intractable
- Functional (gray-box) Model -- Used
- Logic Gate Model -- Not used; inadequately models transistors & capacitors
- Electrical Model -- Very expensive
- Geometrical Model -- Layout Model; used with Inductive Fault Analysis

### Reduced Functional Model (van de Goor)

- n Memory bits, B bits/word, n/B addresses
- Access happens when Address Latch contents change
- Low-order address bits operate column decoder, high-order operate row decoder
- read -- Precharge bit lines, then activate row
- write -- Keep driving bit lines during evaluation
- Refresh -- Read all bits in 1 row and simultaneously refresh them
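The address split described above can be sketched as a pair of bit operations. The function name and the `col_bits` parameter (the number of low-order bits fed to the column decoder) are assumptions for illustration; real devices may also scramble addresses.

```python
def split_address(address, col_bits):
    """Split a word address into (row, column) decoder inputs.

    The low-order `col_bits` bits select the column; the remaining
    high-order bits select the row.
    """
    column = address & ((1 << col_bits) - 1)
    row = address >> col_bits
    return row, column


print(split_address(0b110101, 3))  # (6, 5): row 0b110, column 0b101
```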

### Inversion Coupling Faults (CFin)

- ↑ or ↓ transition in cell *j* inverts contents of cell *i*
- Condition: For all cells that are coupled, each should be read after a series of possible CFins may have occurred, and the number of coupled cell transitions must be odd (to prevent the CFins from masking each other).
- Notation: <↑; ↕> and <↓; ↕>
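The requirement for an odd number of transitions follows from the fact that two inversions cancel. A minimal sketch, in which each sensitizing transition in cell j flips cell i once (the function name is illustrative):

```python
def victim_after(transitions, initial=0):
    """Value of coupled cell i after a number of sensitizing transitions
    in cell j, each of which inverts cell i."""
    value = initial
    for _ in range(transitions):
        value ^= 1  # one CFin activation inverts the coupled cell
    return value


print(victim_after(2))  # 0: two inversions mask each other, a read sees nothing wrong
print(victim_after(3))  # 1: an odd count leaves the cell inverted, so a read detects it
```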


### Good Machine State Transition Diagram






### Idempotent Coupling Faults (CFid)

- ↑ or ↓ transition in cell j sets cell i to 0 or 1
- Condition: For all coupled faults, each should be read after a series of possible CFids may have happened, such that the sensitized CFids do not mask each other.
- Asymmetric: coupled cell only does ↑ or ↓
- Symmetric: coupled cell does both due to fault
- <↑; 0>, <↑; 1>, <↓; 0>, <↓; 1>
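A small simulation shows why the order of reads and writes matters for this fault class. The sketch below models one idempotent coupling fault of type <↑; 1> and runs two march tests against it. The class `CFidMemory`, the function `passes`, and the test encoding are hypothetical, and elements with either addressing order are simply run in increasing order.

```python
class CFidMemory:
    """Hypothetical memory with a single idempotent coupling fault:
    a 0 -> 1 transition in cell j forces cell i to `forced`."""

    def __init__(self, n, j, i, forced):
        self.cells = [0] * n
        self.j, self.i, self.forced = j, i, forced

    def write(self, address, value):
        rising = address == self.j and self.cells[address] == 0 and value == 1
        self.cells[address] = value
        if rising:  # the transition sensitizes the fault
            self.cells[self.i] = self.forced

    def read(self, address):
        return self.cells[address]


def passes(memory, n, march):
    """Apply a march test; return True if every read sees its expected value."""
    for order, ops in march:
        addresses = range(n - 1, -1, -1) if order == "down" else range(n)
        for address in addresses:
            for op in ops.split():
                kind, bit = op[0], int(op[1])
                if kind == "w":
                    memory.write(address, bit)
                elif memory.read(address) != bit:
                    return False  # mismatch: the fault is detected
    return True


MATS_PLUS = [("any", "w0"), ("up", "r0 w1"), ("down", "r1 w0")]
MARCH_C_MINUS = [("any", "w0"), ("up", "r0 w1"), ("up", "r1 w0"),
                 ("down", "r0 w1"), ("down", "r1 w0"), ("any", "r0")]

# Coupling cell j = 2 sits at a higher address than coupled cell i = 0.
print(passes(CFidMemory(4, j=2, i=0, forced=1), 4, MATS_PLUS))      # True: fault escapes
print(passes(CFidMemory(4, j=2, i=0, forced=1), 4, MARCH_C_MINUS))  # False: fault detected
```

In this instance MATS+ never reads cell 0 while it should hold 0 after cell 2 rises, whereas the decreasing (r0, w1) element of MARCH C- does.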




### Bridging Faults

- Short circuit between 2+ cells or lines
- 0 or 1 state of coupling cell, rather than coupling cell transition, causes coupled cell change
- Bidirectional fault -- i affects j, j affects i
- AND Bridging Faults (ABF):
  - <0,0 / 0,0>, <0,1 / 0,0>, <1,0 / 0,0>, <1,1 / 1,1>
- OR Bridging Faults (OBF):
  - <0,0 / 0,0>, <0,1 / 1,1>, <1,0 / 1,1>, <1,1 / 1,1>
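The ABF and OBF entries can be read as "intended values of the two cells / values actually stored". A minimal sketch of that behavior, with an illustrative function name:

```python
def bridge(kind, i, j):
    """Values stored in two bridged cells when i and j are the intended values."""
    if kind == "AND":
        value = i & j
    elif kind == "OR":
        value = i | j
    else:
        raise ValueError("kind must be 'AND' or 'OR'")
    return value, value  # bidirectional: both cells end up with the same value


for kind in ("AND", "OR"):
    for i in (0, 1):
        for j in (0, 1):
            print(kind, (i, j), "->", bridge(kind, i, j))
```

Only the mixed patterns (0,1) and (1,0) are changed by either fault, so a test has to place opposite values in the two bridged cells to expose it.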













### DRAM/SRAM Fault Modeling

| DRAM or SRAM Faults                     | Model      |
|-----------------------------------------|------------|
| Shorts & opens in memory cell array     | SAF, SCF   |
| Shorts & opens in address decoder       | AF         |
| Access time failures in address decoder | Functional |
| Coupling capacitances between cells     | CF         |
| Bit line shorted to word line           | IDDQ       |
| Transistor gate shorted to channel      | IDDQ       |
| Transistor stuck-open fault             | SOF        |
| Pattern sensitive fault                 | PSF        |
| Diode-connected transistor 2 cell short |            |
| Open transistor drain                   |            |
| Gate oxide short                        |            |
| Bridging fault                          |            |











